Prevent data corruption for StorPool volumes #10799

Merged
1 commit merged into apache:4.19 on May 16, 2025

Conversation

Contributor

@slavkap slavkap commented Apr 30, 2025

Description

Prevents data corruption in cases where a VM is started on a new host while it is still running on another host.

Scenario: if a host is down and the admin stops a VM with the option force=true, the admin is then able to start the VM on another host. This may lead to data corruption: once the problem with the first host is fixed and it comes back up, the volume is attached to both hosts at the same time.
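The scenario above can be illustrated with a minimal, self-contained sketch. It is not the code merged in this PR: the class `VolumeAttachGuard`, its in-memory attachment map, and the method names are hypothetical stand-ins for whatever the storage backend exposes. The sketch assumes one possible mitigation, namely detaching each volume from every other host before the VM is started on the target host.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from CloudStack or StorPool.
class VolumeAttachGuard {
    // In-memory stand-in for the storage backend's view: volume id -> hosts it is attached to.
    private final Map<String, Set<String>> attachments = new HashMap<>();

    // Records that a volume is attached to a host (e.g. the host the VM was running on).
    void attach(String volume, String host) {
        attachments.computeIfAbsent(volume, v -> new HashSet<>()).add(host);
    }

    // Returns an immutable snapshot of the hosts a volume is currently attached to.
    Set<String> hostsFor(String volume) {
        return Set.copyOf(attachments.getOrDefault(volume, Set.of()));
    }

    // Before starting the VM on targetHost, drop every stale attachment to other hosts,
    // so a recovering host can no longer write to the same volume.
    void prepareForStart(List<String> volumes, String targetHost) {
        for (String volume : volumes) {
            Set<String> hosts = attachments.computeIfAbsent(volume, v -> new HashSet<>());
            hosts.removeIf(host -> !host.equals(targetHost));
            hosts.add(targetHost);
        }
    }
}
```

With a volume attached to a downed host, calling `prepareForStart` for a new host leaves the volume attached to the new host only, which is the invariant the description is asking for.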

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

How Has This Been Tested?

Tested manually and with StorPool's smoke tests.

codecov bot commented Apr 30, 2025

Codecov Report

Attention: Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Project coverage is 16.42%. Comparing base (2df1ac5) to head (5c54178).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...n/java/com/cloud/vm/VirtualMachineManagerImpl.java | 0.00% | 4 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #10799      +/-   ##
============================================
- Coverage     16.42%   16.42%   -0.01%     
- Complexity    13549    13552       +3     
============================================
  Files          5673     5673              
  Lines        500080   500084       +4     
  Branches      60506    60507       +1     
============================================
- Hits          82153    82148       -5     
- Misses       408794   408804      +10     
+ Partials       9133     9132       -1     

| Flag | Coverage Δ |
|---|---|
| uitests | 4.00% <ø> (ø) |
| unittests | 17.29% <0.00%> (-0.01%) ⬇️ |

@rohityadavcloud rohityadavcloud added this to the 4.21.0 milestone May 1, 2025
Contributor

@DaanHoogland DaanHoogland left a comment

makes perfect sense

@slavkap
Contributor Author

slavkap commented May 14, 2025

Hi @Pearl1594, @DaanHoogland, @rohityadavcloud!
My colleagues asked me if I could backport this to an earlier version. Would it be a problem for you if I rebase this onto 4.19.3.0?

@DaanHoogland
Contributor

> Hi @Pearl1594, @DaanHoogland, @rohityadavcloud! My colleagues asked me if I could backport this to an earlier version. Would it be a problem for you if I rebase this onto 4.19.3.0?

sure

Prevents data corruption in cases where a VM is started on a new host while it is already running on another host

Scenario: if the host is down and the admin stops a VM with the
option `force=true`, the admin will be able to start it on another host.
This may lead to data corruption when the first host comes back Up and
the volume is attached on both hosts.

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@slavkap slavkap changed the base branch from main to 4.19 May 15, 2025 09:53
@DaanHoogland
Contributor

@blueorangutan package

@blueorangutan

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13395

@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-13322)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 50204 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10799-t13322-kvm-ol8.zip
Smoke tests completed. 133 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Contributor

@harikrishna-patnala harikrishna-patnala left a comment

LGTM. This method is implemented only in the StorPool plugin, so I don't see any regressions. Thank you.

@harikrishna-patnala harikrishna-patnala modified the milestones: 4.21.0, 4.19.3 May 16, 2025
@DaanHoogland DaanHoogland merged commit c183fc9 into apache:4.19 May 16, 2025
1 check passed
5 participants